3 research outputs found

    A Few-Shot Approach to Dysarthric Speech Intelligibility Level Classification Using Transformers

    Dysarthria is a speech disorder that hinders communication due to difficulties in articulating words. Detecting dysarthria is important because it can inform a treatment plan and help improve a person's quality of life and ability to communicate effectively. Much of the literature has focused on improving ASR systems for dysarthric speech. The objective of the current work is to develop models that accurately classify the presence of dysarthria and also report the intelligibility level from limited data, by employing a few-shot approach with a transformer model. This work also aims to tackle the data leakage present in previous studies. Our whisper-large-v2 transformer model, trained on a subset of the UASpeech dataset containing medium intelligibility level patients, achieved an accuracy of 85%, precision of 0.92, recall of 0.8, F1-score of 0.85, and specificity of 0.91. Experimental results also demonstrate that the model trained on the 'words' dataset performed better than the models trained on the 'letters' and 'digits' datasets. Moreover, the multiclass model achieved an accuracy of 67%. Comment: Paper has been presented at ICCCNT 2023 and the final version will be published in IEEE Digital Library Xplor
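For readers unfamiliar with the metrics quoted in this abstract, the following minimal sketch shows how accuracy, precision, recall, F1-score, and specificity are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. All counts used here are hypothetical.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)     # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
```

Specificity is reported alongside recall because a detector can reach high recall simply by labelling most speakers as dysarthric; specificity shows how often healthy speakers are correctly recognized.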

    A Visibility Graph Approach for Multi-Stage Classification of Parkinson’s Disease Using Multimodal Data

    Parkinson’s disease (PD) is a neurodegenerative disorder characterized by several motor symptoms such as resting tremor, muscular rigidity, slowness of movement, and various speech impairments. PD is a singular, multi-system disorder that gradually worsens over time. In this research, we classify the neurological state of patients with Parkinson’s disease (PwPD) according to the third section of the Movement Disorders Society - Unified Parkinson’s Disease Rating Scale (MDS-UPDRS-III) using multimodal bio-signal data. As PD advances from a low to an advanced state, PwPD experience difficulties in speech production and irregularities in their gait patterns. Monitoring the chaotic nature of time series data corresponding to speech and gait biomarkers can provide insights into the progression of the condition across different stages. This work is the first to analyze PD from a complex-systems perspective, representing the biomarkers as complex networks. The time series corresponding to speech and gait signals are each represented separately as a complex network using the visibility graph algorithm. The characterization of the different stages of PD is explored for each modality using different network features. Performance evaluation shows that the results obtained using the multimodal configuration of speech and gait left foot signals outperform the state-of-the-art method. Moreover, performance comparison with the unimodal counterparts demonstrates the need for multimodal assessment of PD severity. The configuration ‘speech and gait left foot’ outperforms the unimodal configurations in accuracy by 32% for speech, 3% for gait left foot, 19% for gait right foot, and 3% for gait both feet.
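As an illustration of how a time series can be turned into a complex network, here is a minimal sketch of the natural visibility graph construction: two samples are linked when the straight line between them passes above every sample in between. The abstract does not say which visibility graph variant or which network features the authors used, so the natural-visibility criterion, the function names, and the toy series are all assumptions made for illustration.

```python
from typing import List, Sequence, Set, Tuple


def natural_visibility_graph(series: Sequence[float]) -> Set[Tuple[int, int]]:
    """Return the edge set of the natural visibility graph of a time series.

    Samples a < b are connected if every intermediate sample c lies strictly
    below the line joining (a, series[a]) and (b, series[b]).
    Brute-force O(n^3) in the worst case; fine for short illustrative series.
    """
    n = len(series)
    edges: Set[Tuple[int, int]] = set()
    for a in range(n - 1):
        for b in range(a + 1, n):
            visible = all(
                series[c] < series[b] + (series[a] - series[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            if visible:
                edges.add((a, b))
    return edges


def degree_sequence(n: int, edges: Set[Tuple[int, int]]) -> List[int]:
    """Node degrees, one simple network feature derivable from the graph."""
    degrees = [0] * n
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Toy series (hypothetical values, not speech or gait data).
toy = [1.0, 3.0, 2.0, 4.0]
toy_edges = natural_visibility_graph(toy)
toy_degrees = degree_sequence(len(toy), toy_edges)
```

In this toy series the peak at index 1 blocks the view from index 0 to later samples, so index 0 has degree 1 while the peak itself has degree 3. Features such as the degree distribution summarize this structure and could then be passed to a classifier.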